Splitting source files between CUDA and CPU code. #172

erictzeng · 2014-02-27T02:47:31Z

Addresses #152. Tests pass.

@Yangqing, I modeled this after some of the layer files that were already split between .cpp and .cu, but I'd appreciate a sanity check just to ensure that this way of splitting things isn't going to cause much pain and misery in the future. :)

sguada · 2014-02-27T02:58:03Z

@erictzeng congrats, it compiles, builds, and passes the tests on OS X 10.7.

sguada · 2014-02-27T03:07:55Z

@erictzeng Something weird happened: the second time I ran it, the multinomial_logistic_loss_layer test failed. Could this be a problem with the test itself, @Yangqing?

$ ./build/src/caffe/test/test_multinomial_logistic_loss_layer.testbin 
Cuda number of devices: 1
Current device id: 0
[==========] Running 2 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 1 test from MultinomialLogisticLossLayerTest/0, where TypeParam = float
[ RUN      ] MultinomialLogisticLossLayerTest/0.TestGradientCPU
[       OK ] MultinomialLogisticLossLayerTest/0.TestGradientCPU (284 ms)
[----------] 1 test from MultinomialLogisticLossLayerTest/0 (284 ms total)

[----------] 1 test from MultinomialLogisticLossLayerTest/1, where TypeParam = double
[ RUN      ] MultinomialLogisticLossLayerTest/1.TestGradientCPU
[       OK ] MultinomialLogisticLossLayerTest/1.TestGradientCPU (4 ms)
[----------] 1 test from MultinomialLogisticLossLayerTest/1 (4 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 2 test cases ran. (288 ms total)
[  PASSED  ] 2 tests.
$ ./build/src/caffe/test/test_multinomial_logistic_loss_layer.testbin 
Cuda number of devices: 1
Current device id: 0
[==========] Running 2 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 1 test from MultinomialLogisticLossLayerTest/0, where TypeParam = float
[ RUN      ] MultinomialLogisticLossLayerTest/0.TestGradientCPU
./src/caffe/test/test_gradient_check_util.hpp:124: Failure
The difference between computed_gradient and estimated_gradient is 0.026286721229553223, which exceeds threshold_ * scale, where
computed_gradient evaluates to -1.9747008085250854,
estimated_gradient evaluates to -2.0009875297546387, and
threshold_ * scale evaluates to 0.020009875297546387.
debug: (top_id, top_data_id, blob_id, feat_id)=-1,-1,0,38
[  FAILED  ] MultinomialLogisticLossLayerTest/0.TestGradientCPU, where TypeParam = float (317 ms)
[----------] 1 test from MultinomialLogisticLossLayerTest/0 (317 ms total)

[----------] 1 test from MultinomialLogisticLossLayerTest/1, where TypeParam = double
[ RUN      ] MultinomialLogisticLossLayerTest/1.TestGradientCPU
[       OK ] MultinomialLogisticLossLayerTest/1.TestGradientCPU (3 ms)
[----------] 1 test from MultinomialLogisticLossLayerTest/1 (3 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 2 test cases ran. (320 ms total)
[  PASSED  ] 1 test.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] MultinomialLogisticLossLayerTest/0.TestGradientCPU, where TypeParam = float
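
For readers following the numbers in the log: the failing assertion compares the gap between the two gradients against `threshold_ * scale`. Below is a minimal sketch of that comparison. It assumes `scale` is the larger of the two gradient magnitudes (floored at 1) and that `threshold_` is 1e-2, which is what the logged tolerance 0.020009875... divided by 2.0009875... implies. The name `gradients_agree` is hypothetical, and this is not the code in test_gradient_check_util.hpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Returns true when two gradients agree within a tolerance that grows with
// their magnitude. Assumed rule: threshold * max(|computed|, |estimated|, 1).
bool gradients_agree(double computed, double estimated, double threshold) {
  assert(threshold > 0.0);
  const double scale =
      std::max({std::fabs(computed), std::fabs(estimated), 1.0});
  return std::fabs(computed - estimated) <= threshold * scale;
}
```

With the values from the failing float run (computed -1.9747008..., estimated -2.0009875...), the gap is about 0.0263 while the tolerance is about 0.0200, so the check fails by a small margin. Under the same assumed rule, a threshold of 2e-2 would accept these values.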

shelhamer · 2014-02-27T03:09:45Z

@sguada it's just a precision error. This test was relaxed in the boost-eigen branch by slightly increasing epsilon (commit 59584ad2223a27ed5090ee13b2f452f0a5c09eff).

Note that this does not include the MKL/non-MKL merge yet, which is what is broken on OSX. However, we expect the combination of this and #165 to fix #122.

sguada · 2014-02-27T03:12:32Z

@shelhamer I think 0.02 is quite big for a precision error.
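
To give a feel for how a numerical gradient check on a log loss can drift by a percent or so, here is a small sketch. It compares the analytic derivative of f(p) = -log(p) with a central-difference estimate. The step `h` and the probabilities used below are hypothetical and not taken from the Caffe test, so this is an illustration of the general effect and not a diagnosis of the failure above.

```cpp
#include <cassert>
#include <cmath>

// Analytic derivative of f(p) = -log(p).
double analytic_grad(double p) { return -1.0 / p; }

// Central-difference estimate of the same derivative with step h.
double central_diff_grad(double p, double h) {
  assert(h > 0.0 && p > h);  // keep both log arguments positive
  return (-std::log(p + h) + std::log(p - h)) / (2.0 * h);
}

// Relative gap between the estimate and the analytic value.
double relative_gap(double p, double h) {
  const double a = analytic_grad(p);
  return std::fabs(central_diff_grad(p, h) - a) / std::fabs(a);
}
```

To leading order the relative gap is h^2 / (3 p^2), so it grows quickly as p gets small relative to h. For example, `relative_gap(0.05, 0.01)` is a bit over 1%, while `relative_gap(0.5, 0.01)` is on the order of 1e-4.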

shelhamer · 2014-02-27T03:17:22Z

I'm quoting a previous conversation with @Yangqing, who said it was safe to ignore. Thank you for bringing up the question; we should aim for better test documentation.

mavenlin · 2014-02-27T03:29:03Z

Alex also said it was safe to ignore.
https://code.google.com/p/cuda-convnet/wiki/CheckingGradients

Yangqing · 2014-02-27T04:45:42Z

The multinomial loss precision should be fine :)

Splitting this way should be fine too.

shelhamer · 2014-02-27T05:15:09Z

@erictzeng let's figure out #165 now. I'm doing a rebase, then we can check what's what.

sguada · 2014-02-27T06:06:40Z

@shelhamer @mavenlin thanks for the pointers. I think a test should consistently pass or fail; that's part of automated testing. So please adjust the epsilon if it needs to be adjusted.
It would be nice to get a summary at the bottom that counts all the tests that passed or failed. I'm not sure whether that will be easy, but it would be helpful.

shelhamer · 2014-02-27T06:13:44Z

Sergio, it is already adjusted (see the commit I linked), but that change just hasn't made it in yet. I'm trying to bring it in now.

Testing could be improved by fixing #173 and #174.

sguada · 2014-02-27T06:17:23Z

Thanks Evan, I'll take a quick look at how to test the mat wrapper, as mentioned in #173.

tdomhan · 2014-02-27T17:01:29Z

I'm wondering: has any work been done on compiling it without the CUDA parts? From quickly skimming the changes (and correct me if I'm wrong), it looks like the classes are split up, but both the CPU and the GPU code still need to be compiled. In any case, this is of course a very useful first step toward building a CPU-only version.

erictzeng · 2014-02-27T19:22:46Z

You're right that, as of right now, the CPU and GPU code still have to be compiled together. Cleanly separating the two so that Caffe can be compiled without a GPU is one of our short-term goals, though, so consider this just step 1 in an ongoing process!
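
As a rough illustration of where such a separation could lead, here is a sketch of a layer whose CPU path is ordinary C++ and whose GPU path is swapped for a stub when a build switch is set. Everything in it is hypothetical: the `SKETCH_CPU_ONLY` macro, the `ReluSketch` class, and the `Mode` enum are made up for this example and are not Caffe's API or the approach taken in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical switch: a build system would set this, for example through a
// compiler flag, when no CUDA toolkit is available.
#define SKETCH_CPU_ONLY 1

enum class Mode { CPU, GPU };

class ReluSketch {
 public:
  std::vector<float> Forward(const std::vector<float>& in, Mode mode) const {
    return mode == Mode::GPU ? ForwardGpu(in) : ForwardCpu(in);
  }

 private:
  // In a split layout this definition would live in the .cpp file.
  std::vector<float> ForwardCpu(const std::vector<float>& in) const {
    std::vector<float> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
      out[i] = in[i] > 0.0f ? in[i] : 0.0f;
    }
    assert(out.size() == in.size());
    return out;
  }

#if SKETCH_CPU_ONLY
  // Stub used when the .cu file is not compiled at all.
  std::vector<float> ForwardGpu(const std::vector<float>&) const {
    throw std::runtime_error("GPU mode requested in a CPU-only build");
  }
#else
  // Declared here; the definition would live in the .cu file.
  std::vector<float> ForwardGpu(const std::vector<float>& in) const;
#endif
};
```

The point of the stub is that the header and the CPU source can be compiled and linked on their own, while a request for the GPU path fails loudly at run time instead of at link time.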

Splitting source files between CUDA and CPU code.

b17ac66

shelhamer added a commit that referenced this pull request Feb 27, 2014

Merge pull request #172 from erictzeng/split_cuda

40a1548

Split source files between CUDA and CPU code. Pave the way for #3 and #122.

shelhamer merged commit 40a1548 into BVLC:dev Feb 27, 2014

shelhamer mentioned this pull request Feb 27, 2014

Split CUDA code (*.cu) from CPU code (*.cpp). #152

Closed

5 tasks

kloudkl mentioned this pull request Mar 17, 2014

How to run a pretrained model on CPU-only machine #211

Closed

sguada mentioned this pull request Mar 18, 2014

Improved matcaffe #223

Merged

mitmul pushed a commit to mitmul/caffe that referenced this pull request Sep 30, 2014

Merge pull request BVLC#172 from erictzeng/split_cuda

adac28e

Split source files between CUDA and CPU code. Pave the way for BVLC#3 and BVLC#122.

lopho pushed a commit to lopho/caffe that referenced this pull request Jun 29, 2016

Merge pull request BVLC#172 from amirgholami/gaussianstatic

c374dd4

Added new GaussianStatic dummy layer

wk910930 pushed a commit to wk910930/caffe that referenced this pull request Jun 21, 2017

Merge pull request BVLC#172 from yjxiong/bug_fix/cudnn_reshape

21c1471

fix the bug in input reshape of cuDNN
